🚀 Discover the power of GPU computing to accelerate your research on UCLA’s Hoffman2 cluster! This beginner-friendly workshop will guide you through the basics of GPU utilization, enhancing your projects with cutting-edge computational efficiency. ⭐
👉 What you’ll learn:
For suggestions: cpeterson@oarc.ucla.edu
This presentation and accompanying materials are available on 🔗 UCLA OARC GitHub Repository
You can view the slides in:
Note: 🛠️ This presentation was built using Quarto and RStudio.
🚀 In the mid-2000s, GPUs began to be used for non-graphical computations. NVIDIA introduced CUDA, a parallel computing platform and programming model that allows general-purpose (non-graphics) programs to be compiled and run on GPUs, spearheading the era of General-Purpose GPU computing (GPGPU).
GeForce 256
A100
GPUs are ubiquitous and found in devices ranging from PCs to mobile phones, and gaming consoles like Xbox and PlayStation.
Though initially designed for graphics, GPUs are now used in a wide range of applications.
picture source GROMACS
The significant speedup offered by GPUs comes from their ability to parallelize operations over thousands of cores, unlike traditional CPUs.
picture source NVIDIA
picture source NVIDIA
There are multiple GPU types available in the cluster. Each GPU has a different compute capability, memory size, and clock speed.
| GPU type | # CUDA cores | VMem | SGE option |
|---|---|---|---|
| NVIDIA A100 | 6912 | 80 GB | -l gpu,A100,cuda=1 |
| Tesla V100 | 5120 | 32 GB | -l gpu,V100,cuda=1 |
| RTX 2080 Ti | 4352 | 10 GB | -l gpu,RTX2080Ti,cuda=1 |
| Tesla P4 | 2560 | 8 GB | -l gpu,P4,cuda=1 |
Warning
When you use the -l gpu option, it only reserves the GPU for your job.
You will still need to use GPU-optimized software and libraries to take advantage of the GPU’s parallel processing power.
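As a sketch, a minimal SGE job script that requests a single A100 GPU from the table above might look like the following (the output file name, runtime, and program name are placeholders, not from the workshop materials):

```shell
#!/bin/bash
#$ -cwd                              # run from the current directory
#$ -o job_output.log                 # placeholder log file name
#$ -j y                              # merge stdout and stderr
#$ -l gpu,A100,cuda=1,h_rt=1:00:00   # one A100 GPU, 1 hour runtime (placeholder)

module load cuda                     # make the CUDA toolkit available
./my_gpu_program                     # placeholder GPU executable
```

Submit the script with qsub; remember that reserving the GPU does nothing unless the program itself is GPU-enabled.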
The following sections will cover how to compile and run GPU optimized code on Hoffman2.
CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) model from NVIDIA. It enables developers to write software that harnesses the power of GPUs for more than just graphics — expanding into high-performance computing and deep learning.
On Hoffman2, you can compile CUDA code by loading the cuda module. This prepares your environment with tools from the CUDA toolkit, which includes essential libraries and compilers for GPU code execution.
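A quick sketch of that workflow on a login or compute node (module versions vary; `module avail` shows what is actually installed):

```shell
module avail cuda    # list the CUDA toolkit versions installed on Hoffman2
module load cuda     # load the default CUDA module
nvcc --version       # confirm the CUDA compiler is now on your PATH
```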
picture source NVIDIA
Here’s a simple CUDA code example that performs matrix multiplication (1024x1024):
MatrixMult folder
- Matrix-cpu.cpp contains the CPU (serial) code
- Matrix-gpu.cu contains the CUDA code
- MatrixMult.job is the job submission file

Be on the lookout for GPU-optimized software for your research!
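Assuming the files above, the example can be compiled and submitted roughly like this (the `-o` output name is arbitrary):

```shell
module load cuda                       # CUDA compiler and libraries
nvcc -O2 -o matrix-gpu Matrix-gpu.cu   # compile the CUDA version
qsub MatrixMult.job                    # submit the job to a GPU node
qstat -u $USER                         # monitor the job in the queue
```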
Other GPU platforms include:
There are several Python and R packages that use GPUs for various data-intensive tasks, such as machine learning, deep learning, and large-scale data processing.
Python:
R:
Installing TensorFlow and PyTorch on Hoffman2 is straightforward using the Anaconda package manager. (Check out my Workshop on using Anaconda)
Create a new conda environment with CUDA tools.
Install TensorFlow/PyTorch with GPU support and the NVIDIA libraries
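A sketch of those two steps (the module name, environment name, Python version, and package channels below are assumptions; adapt them to your setup):

```shell
module load anaconda3                    # placeholder module name; check `module avail`
conda create -n gpu-ml python=3.10 -y    # new environment (name is a placeholder)
conda activate gpu-ml
conda install -c nvidia cuda-toolkit -y  # NVIDIA CUDA libraries in the environment
pip install "tensorflow[and-cuda]" torch # GPU-enabled TensorFlow (Linux) and PyTorch
```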
Verify the TensorFlow installation. This will only report a GPU if you are on a GPU-enabled node.
# TensorFlow Test:
python -c "import tensorflow as tf; gpus = tf.config.list_physical_devices('GPU'); print('TensorFlow is using:', ('GPU: ' + gpus[0].name) if gpus else 'CPU')"
# PyTorch Test:
python -c "import torch; print('PyTorch is using:', ('GPU: ' + torch.cuda.get_device_name(0)) if torch.cuda.is_available() else 'CPU')"

Explore machine learning with the “Fashion MNIST” dataset using TensorFlow:
Approach:
Dataset Overview:
Now that we have TensorFlow installed, we can run some examples to test the GPU acceleration.
Files in the TF-Torch folder contain examples of using TensorFlow on Hoffman2.
Set up your TensorFlow environment
This approach provides a hands-on way to see the difference in performance when using GPUs compared to CPUs for training machine learning models.
DNA Sequence Classification with PyTorch
We will use RAPIDS for genomic data analysis. RAPIDS is a popular platform for running data workflows, tasks, and manipulations, as well as machine learning, on GPUs.
We will
Let’s add RAPIDS to our environment.
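A sketch of a typical RAPIDS conda install (the RAPIDS release and CUDA versions below are placeholders; check the RAPIDS release selector for the currently recommended command):

```shell
# install RAPIDS into the active conda environment
conda install -c rapidsai -c conda-forge -c nvidia \
    rapids=24.04 python=3.10 cuda-version=12.0 -y
```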
Explore GPU-accelerated data manipulation with cuDF:
Files in the rapids folder
- rapids_analysis-gpu.py - GPU version
- rapids_analysis-cpu.py - CPU version

The rapid_analysis.job will submit the job to the Hoffman2 cluster.
In this file, the line #$ -l gpu,V100 will submit this job to the V100 GPU nodes.
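Before submitting, a quick cuDF sanity check in the same `python -c` style as the TensorFlow/PyTorch tests earlier can confirm the install (this will only succeed on a GPU-enabled node; the tiny DataFrame is just an illustration):

```shell
# verify cuDF can allocate a DataFrame on the GPU
python -c "import cudf; df = cudf.DataFrame({'x': [1, 2, 3]}); print('cuDF sum:', int(df['x'].sum()))"
# then submit the full example
qsub rapid_analysis.job
```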
Explore machine learning with H2O.ai using the Combined Cycle Power Plant dataset:
We will use R and install the H2O.ai package to run the example.
In the h2oai folder, the h2oaiXGBoost.R script contains the code to run XGBoost on the Combined Cycle Power Plant dataset.
The H2O.ai functions will automatically detect the GPU and use it for training.
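A minimal sketch of starting H2O from the command line and checking that XGBoost is available (assumes the h2o R package is already installed in your R library):

```shell
Rscript -e 'library(h2o); h2o.init(); print(h2o.xgboost.available()); h2o.shutdown(prompt = FALSE)'
```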
Hoffman2 has the resources and tools to help you leverage the power of GPUs for your research. ⭐
Main Takeaways:
- Use the -l gpu option to reserve a GPU node

This is an experimental setup that I made that can run both RStudio and Jupyter on Hoffman2.
This environment has many loaded packages (mostly data science related)
A lot of these packages are optimized with Intel’s OneAPI with MKL and GPU support
This is built using Docker and can be run on any system with Apptainer.
This is a pretty large container, so it may take some time to download (I already have it on Hoffman2). I’m working on some minimal versions without the many packages, as well as non-GPU versions.
Warning
This is still a work in progress
This RStudio has TensorFlow and Torch for R installed with GPU support and MKL as well as many data science related R packages.
You can also run Python within this container (the same Python as in Jupyter).
mkdir -pv $SCRATCH/rstudiotmp/var/lib
mkdir -pv $SCRATCH/rstudiotmp/var/run
mkdir -pv $SCRATCH/rstudiotmp/tmp

apptainer run --nv \
-B $SCRATCH/rstudiotmp/var/lib:/var/lib/rstudio-server \
-B $SCRATCH/rstudiotmp/var/run:/var/run/rstudio-server \
-B $SCRATCH/rstudiotmp/tmp:/tmp \
$H2_CONTAINER_LOC/rpylab_rpylab-R4.3.3-python-3.10.10-oneapi-gpu.sif rstudio

This Jupyter also has TensorFlow and PyTorch installed with GPU support and MKL. There is also an R kernel in this Jupyter (the same R from RStudio).
apptainer run --nv \
$H2_CONTAINER_LOC/rpylab_rpylab-R4.3.3-python-3.10.10-oneapi-gpu.sif Rscript myscript.R
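Similarly, the container can presumably be started with Jupyter instead of RStudio (the `jupyter` run argument is an assumption based on the rstudio and Rscript invocations above; check the container's documentation for the exact entry points):

```shell
apptainer run --nv \
    $H2_CONTAINER_LOC/rpylab_rpylab-R4.3.3-python-3.10.10-oneapi-gpu.sif jupyter
```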